SEQUENCE LISTING 



(1) GENERAL INFORMATION 
(i) APPLICANT: Sheppard, Paul O.

(ii) TITLE OF THE INVENTION: SERINE PROTEASE POLYPEPTIDES AND MATERIALS AND METHODS FOR MAKING THEM

(iii) NUMBER OF SEQUENCES: 16

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: ZymoGenetics, Inc.

(B) STREET: 1201 Eastlake Avenue East 

(C) CITY: Seattle 

(D) STATE: WA 

(E) COUNTRY: USA 

(F) ZIP: 98102 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 

(B) FILING DATE: 



(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Parker, Gary E.

(B) REGISTRATION NUMBER: 31,648

(C) REFERENCE/DOCKET NUMBER: 97-16

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 206-442-6673 

(B) TELEFAX: 206-442-6678 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1634 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 105...1280
(D) OTHER INFORMATION:

(A) NAME/KEY: Signal Sequence

(B) LOCATION: 105...161
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GGCACGAGGG GGAGCCGCGC GCTCTCTCCC GGCGCCCACA CCTGTCTGAG CGGCGCAGCG 60
AGCCGCGGCC CGGGCGGGCT GCTCGGCGCG GAACAGTGCT CGGC ATG GCA GGG ATT 116
Met Ala Gly Ile

CCA GGG CTC CTC TTC CTT CTC TTC TTT CTG CTC TGT GCT GTT GGG CAA 164
Pro Gly Leu Leu Phe Leu Leu Phe Phe Leu Leu Cys Ala Val Gly Gln
-15 -10 -5 1

GTG AGC CCT TAC AGT GCC CCC TGG AAA CCC ACT TGG CCT GCA TAC CGC 212
Val Ser Pro Tyr Ser Ala Pro Trp Lys Pro Thr Trp Pro Ala Tyr Arg
5 10 15

CTC CCT GTC GTC TTG CCC CAG TCT ACC CTC AAT TTA GCC AAG CCA GAC 260
Leu Pro Val Val Leu Pro Gln Ser Thr Leu Asn Leu Ala Lys Pro Asp
20 25 30

TTT GGA GCC GAA GCC AAA TTA GAA GTA TCT TCT TCA TGT GGA CCC CAG 308
Phe Gly Ala Glu Ala Lys Leu Glu Val Ser Ser Ser Cys Gly Pro Gln
35 40 45

TGT CAT AAG GGA ACT CCA CTG CCC ACT TAC AAA GAA GCC AAG CAA TAT 356
Cys His Lys Gly Thr Pro Leu Pro Thr Tyr Lys Glu Ala Lys Gln Tyr
50 55 60 65

CTG TCT TAT GAA ACG CTC TAT GCC AAT GGC AGC CGC ACA GAG ACN CAG 404
Leu Ser Tyr Glu Thr Leu Tyr Ala Asn Gly Ser Arg Thr Glu Xaa Gln
70 75 80

GTG GGC ATC TAC ATC CTC AGC AGT AGT GGA GAT GGG GCC CAN CNC CGA 452
Val Gly Ile Tyr Ile Leu Ser Ser Ser Gly Asp Gly Ala Xaa Xaa Arg
85 90 95

GAC TCA GGG TCT TCA GGA AAG TCT CGA AGG AAG CGG CAG ATT TAT GGC 500
Asp Ser Gly Ser Ser Gly Lys Ser Arg Arg Lys Arg Gln Ile Tyr Gly
100 105 110

TAT GAC AGC AGG TTC AGC ATT TTT GGG AAG GAC TTC CTG CTC AAC TAC 548
Tyr Asp Ser Arg Phe Ser Ile Phe Gly Lys Asp Phe Leu Leu Asn Tyr
115 120 125

CCT TTC TCA ACA TCA GTG AAG TTA TCC ACG GGC TGC ACC GGC ACC CTG 596
Pro Phe Ser Thr Ser Val Lys Leu Ser Thr Gly Cys Thr Gly Thr Leu
130 135 140 145

GTG GCA GAA AAN CAT GTC CTC ACA GCT GCC CAC TGC ATA CAC GAT GGA 644
Val Ala Glu Xaa His Val Leu Thr Ala Ala His Cys Ile His Asp Gly
150 155 160

AAA ACC TAT GTG AAA GGA ACC CAG AAG CTT CGA GTC GGC TTC CTA AAG 692
Lys Thr Tyr Val Lys Gly Thr Gln Lys Leu Arg Val Gly Phe Leu Lys
165 170 175

CCC AAG TTT AAA GAT GGT GGT CGA GGG GCC AAC GAC TCC ACT TCA GCC 740
Pro Lys Phe Lys Asp Gly Gly Arg Gly Ala Asn Asp Ser Thr Ser Ala
180 185 190

ATG CCC GAG CAG ATG AAA TTT CAG TGG ATC CGG GTG AAA CGC ACC CAT 788
Met Pro Glu Gln Met Lys Phe Gln Trp Ile Arg Val Lys Arg Thr His
195 200 205

GTG CCC AAG GGT TGG ATC AAG GGC AAT GCC AAT GAC ATC GGC ATG GAT 836
Val Pro Lys Gly Trp Ile Lys Gly Asn Ala Asn Asp Ile Gly Met Asp
210 215 220 225

TAT GAT TAT GCC CTC CTG GAA CTC AAA AAG CCC CAC AAG AGA AAA TTT 884
Tyr Asp Tyr Ala Leu Leu Glu Leu Lys Lys Pro His Lys Arg Lys Phe
230 235 240

ATG AAG ATT GGG GTG AGC CCT CCT GCT AAG CAG CTG CCA GGG GGC AGA 932
Met Lys Ile Gly Val Ser Pro Pro Ala Lys Gln Leu Pro Gly Gly Arg
245 250 255

ATT CAC TTC TCT GGT TAT GAC AAT GAC CGA CCA GGC AAT TTG GTG TAT 980
Ile His Phe Ser Gly Tyr Asp Asn Asp Arg Pro Gly Asn Leu Val Tyr
260 265 270

CGC TTC TGT GAC GTC AAA GAC GAG ACC TAT GAC TTG TTG TAC CAG CAA 1028
Arg Phe Cys Asp Val Lys Asp Glu Thr Tyr Asp Leu Leu Tyr Gln Gln
275 280 285

TGC GAT GCC CAG CCA GGG GCC AGC GGG TAT GGG GTA TAT GTG AGG ATG 1076
Cys Asp Ala Gln Pro Gly Ala Ser Gly Tyr Gly Val Tyr Val Arg Met
290 295 300 305

TGG AAG AGA CAG CAG CAG AAG TGG GAG CGA AAA ATT ATT GGC ATT TTT 1124
Trp Lys Arg Gln Gln Gln Lys Trp Glu Arg Lys Ile Ile Gly Ile Phe
310 315 320

TCA GGG CAC CAG TGG GTG GAC ATG AAT GGT TCC CCA CAG GAT TTC AAC 1172
Ser Gly His Gln Trp Val Asp Met Asn Gly Ser Pro Gln Asp Phe Asn
325 330 335

GTG GCT GTC AGA ATC ACT CCT CTC AAA TAT GCC CAG ATC TGC TAT TGG 1220
Val Ala Val Arg Ile Thr Pro Leu Lys Tyr Ala Gln Ile Cys Tyr Trp
340 345 350

ATT AAA GGA AAC TAC CTG GAT TGT AGG GAG GGT GAC ACA GTG TTC CTT 1268
Ile Lys Gly Asn Tyr Leu Asp Cys Arg Glu Gly Asp Thr Val Phe Leu
355 360 365

CCT GGC AGC AAT TAAGGTCTTC ATGTTCTTAT TTTAGGAGAG GCCAAATTGT TTTTT 1325
Pro Gly Ser Asn
370

GTCATTGGCG TGCACACGTG TGTGTGTGTG TGTGTGTGTG TGTAAGGTGT CTTATAATCT 1385 

TTTACCTATT TCTTACAATT GCAAGATGAC TGGCTTTACT ATTTGAAAAC TGGTTTGTGT 1445 

ATCATATCAT ATATCATTTA AGCAGTTTGA AGGCATACTT TTGCATAGAA ATAAAAAAAA 1505 

TACTGATTTG GGGCAATGAG GAATATTTGA CAATTAAGTT AATCTTCACG TTTTTGCAAA 1565 

CTTTGATTTT TATTTCATCT GAACTTGTTT CAAAGATTTA TATTAAATAT TTGGCATACA 1625

AGAGATATG 1634 
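
The feature locations given for SEQ ID NO:1 can be checked against the printed sequence. The following is a minimal sketch, assuming the standard genetic code and 1-based inclusive coordinates; the names `translate`, `feature_length`, and `SEQ1_START` are illustrative and not part of the listing. It confirms that the coding sequence begins with an ATG at position 105 and that the stated locations correspond to 392 codons in total, 19 of them in the signal sequence.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (an assumption of this sketch; the listing does not state which code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA codon by codon; codons with ambiguity codes become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def feature_length(start, end):
    """Length in nucleotides of a 1-based, inclusive feature location."""
    return end - start + 1


# First 116 nucleotides of SEQ ID NO:1 as printed in the listing, spaces removed.
SEQ1_START = (
    "GGCACGAGGGGGAGCCGCGCGCTCTCTCCCGGCGCCCACACCTGTCTGAGCGGCGCAGCG"
    "AGCCGCGGCCCGGGCGGGCTGCTCGGCGCGGAACAGTGCTCGGC"
    "ATGGCAGGGATT"
)
CDS_START, CDS_END = 105, 1280      # Coding Sequence location from the listing
SIGNAL_START, SIGNAL_END = 105, 161  # Signal Sequence location from the listing

if __name__ == "__main__":
    print(translate(SEQ1_START[CDS_START - 1:]))           # first four residues
    print(feature_length(CDS_START, CDS_END) // 3)         # codons in the CDS
    print(feature_length(SIGNAL_START, SIGNAL_END) // 3)   # signal-sequence codons
```

Codons containing N (for example CAN or CNC in SEQ ID NO:1) are reported as `X` here, which mirrors the Xaa entries in the printed translation.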

(2) INFORMATION FOR SEQ ID NO: 2: 



(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 392 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(ix) FEATURE:

(A) NAME/KEY: Signal Sequence

(B) LOCATION: 1...19
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:



Met Ala Gly Ile Pro Gly Leu Leu Phe Leu Leu Phe Phe Leu Leu Cys
-15 -10 -5

Ala Val Gly Gln Val Ser Pro Tyr Ser Ala Pro Trp Lys Pro Thr Trp
1 5 10

Pro Ala Tyr Arg Leu Pro Val Val Leu Pro Gln Ser Thr Leu Asn Leu
15 20 25

Ala Lys Pro Asp Phe Gly Ala Glu Ala Lys Leu Glu Val Ser Ser Ser
30 35 40 45

Cys Gly Pro Gln Cys His Lys Gly Thr Pro Leu Pro Thr Tyr Lys Glu
50 55 60

Ala Lys Gln Tyr Leu Ser Tyr Glu Thr Leu Tyr Ala Asn Gly Ser Arg
65 70 75

Thr Glu Xaa Gln Val Gly Ile Tyr Ile Leu Ser Ser Ser Gly Asp Gly
80 85 90

Ala Xaa Xaa Arg Asp Ser Gly Ser Ser Gly Lys Ser Arg Arg Lys Arg
95 100 105

Gln Ile Tyr Gly Tyr Asp Ser Arg Phe Ser Ile Phe Gly Lys Asp Phe
110 115 120 125

Leu Leu Asn Tyr Pro Phe Ser Thr Ser Val Lys Leu Ser Thr Gly Cys
130 135 140

Thr Gly Thr Leu Val Ala Glu Xaa His Val Leu Thr Ala Ala His Cys
145 150 155

Ile His Asp Gly Lys Thr Tyr Val Lys Gly Thr Gln Lys Leu Arg Val
160 165 170

Gly Phe Leu Lys Pro Lys Phe Lys Asp Gly Gly Arg Gly Ala Asn Asp
175 180 185

Ser Thr Ser Ala Met Pro Glu Gln Met Lys Phe Gln Trp Ile Arg Val
190 195 200 205

Lys Arg Thr His Val Pro Lys Gly Trp Ile Lys Gly Asn Ala Asn Asp
210 215 220

Ile Gly Met Asp Tyr Asp Tyr Ala Leu Leu Glu Leu Lys Lys Pro His
225 230 235

Lys Arg Lys Phe Met Lys Ile Gly Val Ser Pro Pro Ala Lys Gln Leu
240 245 250

Pro Gly Gly Arg Ile His Phe Ser Gly Tyr Asp Asn Asp Arg Pro Gly
255 260 265

Asn Leu Val Tyr Arg Phe Cys Asp Val Lys Asp Glu Thr Tyr Asp Leu
270 275 280 285

Leu Tyr Gln Gln Cys Asp Ala Gln Pro Gly Ala Ser Gly Tyr Gly Val
290 295 300

Tyr Val Arg Met Trp Lys Arg Gln Gln Gln Lys Trp Glu Arg Lys Ile
305 310 315

Ile Gly Ile Phe Ser Gly His Gln Trp Val Asp Met Asn Gly Ser Pro
320 325 330

Gln Asp Phe Asn Val Ala Val Arg Ile Thr Pro Leu Lys Tyr Ala Gln
335 340 345

Ile Cys Tyr Trp Ile Lys Gly Asn Tyr Leu Asp Cys Arg Glu Gly Asp
350 355 360 365

Thr Val Phe Leu Pro Gly Ser Asn
370







(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
TGYACNGGNW SNHTNRT 17 
(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
AYNADNSWNC CNGTRCA 17



(2) INFORMATION FOR SEQ ID NO: 5: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5
ACNGCNGSNC AYTGYAT 

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6
ATRCARTGNS CNGCNGT 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7 
WYRTNCCNWV NGGNTGG 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
CCANCCNBWN GGNAYRW 

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
AYNRAYTAYG AYTAYGS 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10 
SCRTARTCRT ARTYNRT 
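
The degenerate oligonucleotides of SEQ ID NOS:3 to 10 appear to form four sense/antisense pairs (3 and 4, 5 and 6, 7 and 8, 9 and 10). The sketch below, which assumes the usual IUPAC nucleotide ambiguity codes, checks this by reverse-complementing each odd-numbered sequence; the helper name `reverse_complement` and the `PAIRS` table are illustrative only and not part of the listing.

```python
# IUPAC complements: A<->T, C<->G, R<->Y, K<->M, B<->V, D<->H; S, W, N map to themselves.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a nucleotide string that may contain IUPAC codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Sequences copied from the listing with the spacing removed.
PAIRS = {
    ("SEQ ID NO:3", "SEQ ID NO:4"): ("TGYACNGGNWSNHTNRT", "AYNADNSWNCCNGTRCA"),
    ("SEQ ID NO:5", "SEQ ID NO:6"): ("ACNGCNGSNCAYTGYAT", "ATRCARTGNSCNGCNGT"),
    ("SEQ ID NO:7", "SEQ ID NO:8"): ("WYRTNCCNWVNGGNTGG", "CCANCCNBWNGGNAYRW"),
    ("SEQ ID NO:9", "SEQ ID NO:10"): ("AYNRAYTAYGAYTAYGS", "SCRTARTCRTARTYNRT"),
}

if __name__ == "__main__":
    for (sense_id, antisense_id), (sense, antisense) in PAIRS.items():
        match = reverse_complement(sense) == antisense
        print(f"{antisense_id} is the reverse complement of {sense_id}: {match}")
```

A check of this kind is also a convenient way to catch transcription errors in degenerate sequences, since a single misread character breaks the pairing.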

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: ZC11667 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11 



TATGCAGGCC AAGTGGGTTT CCAGGGGGCA CTGTAAGGGC 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: ZC13508 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12
TCTGCTCTGT GCTGTTGG 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other
(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC13509

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13

AGTCTGGCTT GGCTAAAT 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1656 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 105...1280
(D) OTHER INFORMATION:

(A) NAME/KEY: Signal Sequence

(B) LOCATION: 105...161
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

GGCACGAGGG GGAGCCGCGC GCTCTCTCCC GGCGCCCACA CCTGTCTGAG CGGCGCAGCG 60
AGCCGCGGCC CGGGCGGGCT GCTCGGCGCG GAACAGTGCT CGGC ATG GCA GGG ATT 116
Met Ala Gly Ile

CCA GGG CTC CTC TTC CTT CTC TTC TTT CTG CTC TGT GCT GTT GGG CAA 164
Pro Gly Leu Leu Phe Leu Leu Phe Phe Leu Leu Cys Ala Val Gly Gln
-15 -10 -5 1

GTG AGC CCT TAC AGT GCC CCC TGG AAA CCC ACT TGG CCT GCA TAC CGC 212
Val Ser Pro Tyr Ser Ala Pro Trp Lys Pro Thr Trp Pro Ala Tyr Arg
5 10 15

CTC CCT GTC GTC TTG CCC CAG TCT ACC CTC AAT TTA GCC AAG CCA GAC 260
Leu Pro Val Val Leu Pro Gln Ser Thr Leu Asn Leu Ala Lys Pro Asp
20 25 30

TTT GGA GCC GAA GCC AAA TTA GAA GTA TCT TCT TCA TGT GGA CCC CAG 308
Phe Gly Ala Glu Ala Lys Leu Glu Val Ser Ser Ser Cys Gly Pro Gln
35 40 45

TGT CAT AAG GGA ACT CCA CTG CCC ACT TAC GAA GAG GCC AAG CAA TAT 356
Cys His Lys Gly Thr Pro Leu Pro Thr Tyr Glu Glu Ala Lys Gln Tyr
50 55 60 65

CTG TCT TAT GAA ACG CTC TAT GCC AAT GGC AGC CGC ACA GAG ACG CAG 404
Leu Ser Tyr Glu Thr Leu Tyr Ala Asn Gly Ser Arg Thr Glu Thr Gln
70 75 80

GTG GGC ATC TAC ATC CTC AGC AGT AGT GGA GAT GGG GCC CAA CAC CGA 452
Val Gly Ile Tyr Ile Leu Ser Ser Ser Gly Asp Gly Ala Gln His Arg
85 90 95

GAC TCA GGG TCT TCA GGA AAG TCT CGA AGG AAG CGG CAG ATT TAT GGC 500
Asp Ser Gly Ser Ser Gly Lys Ser Arg Arg Lys Arg Gln Ile Tyr Gly
100 105 110

TAT GAC AGC AGG TTC AGC ATT TTT GGG AAG GAC TTC CTG CTC AAC TAC 548
Tyr Asp Ser Arg Phe Ser Ile Phe Gly Lys Asp Phe Leu Leu Asn Tyr
115 120 125

CCT TTC TCA ACA TCA GTG AAG TTA TCC ACG GGC TGC ACC GGC ACC CTG 596
Pro Phe Ser Thr Ser Val Lys Leu Ser Thr Gly Cys Thr Gly Thr Leu
130 135 140 145

GTG GCA GAG AAG CAT GTC CTC ACA GCT GCC CAC TGC ATA CAC GAT GGA 644
Val Ala Glu Lys His Val Leu Thr Ala Ala His Cys Ile His Asp Gly
150 155 160

AAA ACC TAT GTG AAA GGA ACC CAG AAG CTT CGA GTG GGC TTC CTA AAG 692
Lys Thr Tyr Val Lys Gly Thr Gln Lys Leu Arg Val Gly Phe Leu Lys
165 170 175

CCC AAG TTT AAA GAT GGT GGT CGA GGG GCC AAC GAC TCC ACT TCA GCC 740
Pro Lys Phe Lys Asp Gly Gly Arg Gly Ala Asn Asp Ser Thr Ser Ala
180 185 190

ATG CCC GAG CAG ATG AAA TTT CAG TGG ATC CGG GTG AAA CGC ACC CAT 788
Met Pro Glu Gln Met Lys Phe Gln Trp Ile Arg Val Lys Arg Thr His
195 200 205

GTG CCC AAG GGT TGG ATC AAG GGC AAT GCC AAT GAC ATC GGC ATG GAT 836
Val Pro Lys Gly Trp Ile Lys Gly Asn Ala Asn Asp Ile Gly Met Asp
210 215 220 225

TAT GAT TAT GCC CTC CTG GAA CTC AAA AAG CCC CAC AAG AGA AAA TTT 884
Tyr Asp Tyr Ala Leu Leu Glu Leu Lys Lys Pro His Lys Arg Lys Phe
230 235 240

ATG AAG ATT GGG GTG AGC CCT CCT GCT AAG CAG CTG CCA GGG GGC AGA 932
Met Lys Ile Gly Val Ser Pro Pro Ala Lys Gln Leu Pro Gly Gly Arg
245 250 255

ATT CAC TTC TCT GGT TAT GAC AAT GAC CGA CCA GGC AAT TTG GTG TAT 980
Ile His Phe Ser Gly Tyr Asp Asn Asp Arg Pro Gly Asn Leu Val Tyr
260 265 270

CGC TTC TGT GAC GTC AAA GAC GAG ACC TAT GAC TTG CTC TAC CAG CAA 1028
Arg Phe Cys Asp Val Lys Asp Glu Thr Tyr Asp Leu Leu Tyr Gln Gln
275 280 285

TGC GAT GCC CAG CCA GGG GCC AGC GGG TCT GGG GTC TAT GTG AGG ATG 1076
Cys Asp Ala Gln Pro Gly Ala Ser Gly Ser Gly Val Tyr Val Arg Met
290 295 300 305

TGG AAG AGA CAG CAG CAG AAG TGG GAG CGA AAA ATT ATT GGG ATT TTT 1124
Trp Lys Arg Gln Gln Gln Lys Trp Glu Arg Lys Ile Ile Gly Ile Phe
310 315 320

TCA GGG CAC CAG TGG GTG GAC ATG AAT GGT TCC CCA CAG GAT TTC AAC 1172
Ser Gly His Gln Trp Val Asp Met Asn Gly Ser Pro Gln Asp Phe Asn
325 330 335

GTG GCT GTC AGA ATC ACT CCT CTC AAA TAT GCC CAG ATC TGC TAT TGG 1220
Val Ala Val Arg Ile Thr Pro Leu Lys Tyr Ala Gln Ile Cys Tyr Trp
340 345 350

ATT AAA GGA AAC TAC CTG GAT TGT AGG GAG GGT GAC ACA GTG TTC CCT 1268
Ile Lys Gly Asn Tyr Leu Asp Cys Arg Glu Gly Asp Thr Val Phe Pro
355 360 365

CCT GGC AGC AAT TAAGGTCTTC ATGTTCTTAT TTTAGGAGAG GCCAAATTGT TTTTT 1325
Pro Gly Ser Asn
370

GTCATTGGCG TGCACACGTG TGTGTGTGTG TGTGTGTGTG TGTAAGGTGT CTTATAATCT 1385 

TTTACCTATT TCTTACAATT GCAAGATGAC TGGCTTTACT ATTTGAAAAC TGGTTTGTGT 1445 

ATCATATCAT ATATCATTTA AGCAGTTTGA AGGCATACTT TTGCATAGAA ATAAAAAAAA 1505 

TACTGATTTG GGGCAATGAG GAATATTTGA CAATTAAGTT AATCTTCACG TTTTTGCAAA 1565 

CTTTGATTTT TATTTCATCT GAACTTGTTT CAAAGATTTA TATTAAATAT TTGGCATACA 1625 

AGAGATATGA AAAAAAAAAA AAAAAAAAAA A 1656 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 392 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal
(ix) FEATURE:

(A) NAME/KEY: Signal Sequence

(B) LOCATION: 1...19
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:





Met Ala Gly Ile Pro Gly Leu Leu Phe Leu Leu Phe Phe Leu Leu Cys
-15 -10 -5

Ala Val Gly Gln Val Ser Pro Tyr Ser Ala Pro Trp Lys Pro Thr Trp
1 5 10

Pro Ala Tyr Arg Leu Pro Val Val Leu Pro Gln Ser Thr Leu Asn Leu
15 20 25

Ala Lys Pro Asp Phe Gly Ala Glu Ala Lys Leu Glu Val Ser Ser Ser
30 35 40 45

Cys Gly Pro Gln Cys His Lys Gly Thr Pro Leu Pro Thr Tyr Glu Glu
50 55 60

Ala Lys Gln Tyr Leu Ser Tyr Glu Thr Leu Tyr Ala Asn Gly Ser Arg
65 70 75

Thr Glu Thr Gln Val Gly Ile Tyr Ile Leu Ser Ser Ser Gly Asp Gly
80 85 90

Ala Gln His Arg Asp Ser Gly Ser Ser Gly Lys Ser Arg Arg Lys Arg
95 100 105

Gln Ile Tyr Gly Tyr Asp Ser Arg Phe Ser Ile Phe Gly Lys Asp Phe
110 115 120 125

Leu Leu Asn Tyr Pro Phe Ser Thr Ser Val Lys Leu Ser Thr Gly Cys
130 135 140

Thr Gly Thr Leu Val Ala Glu Lys His Val Leu Thr Ala Ala His Cys
145 150 155

Ile His Asp Gly Lys Thr Tyr Val Lys Gly Thr Gln Lys Leu Arg Val
160 165 170

Gly Phe Leu Lys Pro Lys Phe Lys Asp Gly Gly Arg Gly Ala Asn Asp
175 180 185

Ser Thr Ser Ala Met Pro Glu Gln Met Lys Phe Gln Trp Ile Arg Val
190 195 200 205

Lys Arg Thr His Val Pro Lys Gly Trp Ile Lys Gly Asn Ala Asn Asp
210 215 220

Ile Gly Met Asp Tyr Asp Tyr Ala Leu Leu Glu Leu Lys Lys Pro His
225 230 235

Lys Arg Lys Phe Met Lys Ile Gly Val Ser Pro Pro Ala Lys Gln Leu
240 245 250

Pro Gly Gly Arg Ile His Phe Ser Gly Tyr Asp Asn Asp Arg Pro Gly
255 260 265

Asn Leu Val Tyr Arg Phe Cys Asp Val Lys Asp Glu Thr Tyr Asp Leu
270 275 280 285

Leu Tyr Gln Gln Cys Asp Ala Gln Pro Gly Ala Ser Gly Ser Gly Val
290 295 300

Tyr Val Arg Met Trp Lys Arg Gln Gln Gln Lys Trp Glu Arg Lys Ile
305 310 315

Ile Gly Ile Phe Ser Gly His Gln Trp Val Asp Met Asn Gly Ser Pro
320 325 330

Gln Asp Phe Asn Val Ala Val Arg Ile Thr Pro Leu Lys Tyr Ala Gln
335 340 345

Ile Cys Tyr Trp Ile Lys Gly Asn Tyr Leu Asp Cys Arg Glu Gly Asp
350 355 360 365

Thr Val Phe Pro Pro Gly Ser Asn
370

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1176 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

ATGGCNGGNA THCCNGGNYT NYTNTTYYTN YTNTTYTTYY TNYTNTGYGC NGTNGGNCAR 60

GTNWSNCCNT AYWSNGCNCC NTGGAARCCN ACNTGGCCNG CNTAYMGNYT NCCNGTNGTN 120

YTNCCNCARW SNACNYTNAA YYTNGCNAAR CCNGAYTTYG GNGCNGARGC NAARYTNGAR 180

GTNWSNWSNW SNTGYGGNCC NCARTGYCAY AARGGNACNC CNYTNCCNAC NTAYGARGAR 240

GCNAARCART AYYTNWSNTA YGARACNYTN TAYGCNAAYG GNWSNMGNAC NGARACNCAR 300

GTNGGNATHT AYATHYTNWS NWSNWSNGGN GAYGGNGCNC ARCAYMGNGA YWSNGGNWSN 360

WSNGGNAARW SNMGNMGNAA RMGNCARATH TAYGGNTAYG AYWSNMGNTT YWSNATHTTY 420

GGNAARGAYT TYYTNYTNAA YTAYCCNTTY WSNACNWSNG TNAARYTNWS NACNGGNTGY 480

ACNGGNACNY TNGTNGCNGA RAARCAYGTN YTNACNGCNG CNCAYTGYAT HCAYGAYGGN 540

AARACNTAYG TNAARGGNAC NCARAARYTN MGNGTNGGNT TYYTNAARCC NAARTTYAAR 600

GAYGGNGGNM GNGGNGCNAA YGAYWSNACN WSNGCNATGC CNGARCARAT GAARTTYCAR 660

TGGATHMGNG TNAARMGNAC NCAYGTNCCN AARGGNTGGA THAARGGNAA YGCNAAYGAY 720

ATHGGNATGG AYTAYGAYTA YGCNYTNYTN GARYTNAARA ARCCNCAYAA RMGNAARTTY 780

ATGAARATHG GNGTNWSNCC NCCNGCNAAR CARYTNCCNG GNGGNMGNAT HCAYTTYWSN 840

GGNTAYGAYA AYGAYMGNCC NGGNAAYYTN GTNTAYMGNT TYTGYGAYGT NAARGAYGAR 900

ACNTAYGAYY TNYTNTAYCA RCARTGYGAY GCNCARCCNG GNGCNWSNGG NWSNGGNGTN 960

TAYGTNMGNA TGTGGAARMG NCARCARCAR AARTGGGARM GNAARATHAT HGGNATHTTY 1020

WSNGGNCAYC ARTGGGTNGA YATGAAYGGN WSNCCNCARG AYTTYAAYGT NGCNGTNMGN 1080

ATHACNCCNY TNAARTAYGC NCARATHTGY TAYTGGATHA ARGGNAAYTA YYTNGAYTGY 1140

MGNGARGGNG AYACNGTNTT YCCNCCNGGN WSNAAY 1176
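
SEQ ID NO:16 reads as a degenerate back-translation of the 392-residue polypeptide of SEQ ID NO:15, with one degenerate codon per amino acid. The following sketch illustrates that relationship. The codon table is inferred by comparing the two sequences in this listing (for example Ser as WSN, Arg as MGN, Leu as YTN) and is an assumption of the example rather than something the listing states; the names `DEGENERATE_CODON` and `back_translate` are hypothetical.

```python
# One degenerate codon per amino acid, as inferred from SEQ ID NO:15 vs. SEQ ID NO:16.
# Note: WSN, MGN and YTN are deliberately broad and also cover a few codons for
# other amino acids, so this mapping is for illustration only.
DEGENERATE_CODON = {
    "Ala": "GCN", "Arg": "MGN", "Asn": "AAY", "Asp": "GAY", "Cys": "TGY",
    "Gln": "CAR", "Glu": "GAR", "Gly": "GGN", "His": "CAY", "Ile": "ATH",
    "Leu": "YTN", "Lys": "AAR", "Met": "ATG", "Phe": "TTY", "Pro": "CCN",
    "Ser": "WSN", "Thr": "ACN", "Trp": "TGG", "Tyr": "TAY", "Val": "GTN",
}


def back_translate(residues):
    """Map a whitespace-separated three-letter protein string to degenerate DNA."""
    return "".join(DEGENERATE_CODON[r] for r in residues.split())


# First 20 residues of SEQ ID NO:15 (the 19-residue signal sequence plus Gln 1).
SEQ15_START = (
    "Met Ala Gly Ile Pro Gly Leu Leu Phe Leu Leu Phe Phe Leu Leu Cys "
    "Ala Val Gly Gln"
)

if __name__ == "__main__":
    dna = back_translate(SEQ15_START)
    # Print in blocks of ten, matching the layout used in the listing.
    print(" ".join(dna[i:i + 10] for i in range(0, len(dna), 10)))
```

Printed in blocks of ten, the result can be compared directly with the first row of SEQ ID NO:16.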



